The gdal2metadata.py script generates Federal Geographic Data Committee (FGDC) compliant metadata XML files from GDAL-supported rasters.

Overview

gdal2metadata.py extracts spatial information from a georeferenced raster and populates an FGDC metadata template, automating the creation of standards-compliant metadata documentation.

Usage

  • in_Geo.tif (string, required): Path to the input georeferenced raster file
  • in_FGDCtemplate.xml (string, required): Path to the FGDC metadata XML template
  • output.xml (string, required): Path to the output populated metadata XML
  • -debug (flag, default: false): Print detailed image information during processing

Basic Syntax:
python gdal2metadata.py in_Geo.tif in_FGDCtemplate.xml output.xml
With Debug Output:
python gdal2metadata.py -debug in_Geo.tif in_FGDCtemplate.xml output.xml

Workflow

1. Prepare FGDC template

Start with an FGDC metadata XML template containing your static metadata (title, abstract, contact information, etc.).

2. Run gdal2metadata

python gdal2metadata.py lunar_mosaic.tif template.xml output_metadata.xml

3. Verify output

Review the generated XML file to ensure all fields are correctly populated.

4. Manual refinement

Edit the output XML to add or refine metadata that cannot be automatically extracted.
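
Part of the verification in step 3 can be automated. The following is a minimal sketch, not part of gdal2metadata.py, that lists leaf elements left without text. The helper name find_empty_leaves and the sample XML fragment are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET


def find_empty_leaves(root):
    """Return the tags of leaf elements that contain no text."""
    empty = []
    for elem in root.iter():
        is_leaf = len(elem) == 0
        has_text = elem.text is not None and elem.text.strip() != ""
        if is_leaf and not has_text:
            empty.append(elem.tag)
    return empty


# Hypothetical output fragment in which <northbc> was left unpopulated.
sample = (
    "<metadata><idinfo><spdom><bounding>"
    "<westbc>-180.0</westbc><eastbc>180.0</eastbc>"
    "<northbc></northbc><southbc>-90.0</southbc>"
    "</bounding></spdom></idinfo></metadata>"
)
root = ET.fromstring(sample)
print(find_empty_leaves(root))  # ['northbc']
```

For a real file, ET.parse("output_metadata.xml").getroot() would supply the root element. An empty list does not prove the values are correct, only that nothing was left blank.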

What Gets Populated

The script automatically extracts and populates:

Spatial Information

  • Bounding coordinates
  • Coordinate system
  • Projection parameters
  • Datum and spheroid

Raster Properties

  • Image dimensions
  • Pixel resolution
  • Number of bands
  • Data type

Geotransform

  • Upper-left coordinates
  • Pixel spacing
  • Rotation parameters

Temporal

  • Current date/time stamp
  • Processing date
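
To illustrate how bounding coordinates relate to the geotransform and image dimensions, here is a minimal sketch using the standard six-element GDAL geotransform. The function names are hypothetical, and this is not necessarily how gdal2metadata.py computes its values. It also assumes the raster is already in geographic (longitude/latitude) coordinates; a projected raster would need its corners transformed first.

```python
def corner_coords(gt, cols, rows):
    """Map the four pixel-grid corners to georeferenced coordinates.

    gt is a GDAL-style geotransform:
    (x_origin, pixel_width, row_rotation, y_origin, col_rotation, pixel_height)
    """
    corners = []
    for col, row in [(0, 0), (cols, 0), (cols, rows), (0, rows)]:
        x = gt[0] + col * gt[1] + row * gt[2]
        y = gt[3] + col * gt[4] + row * gt[5]
        corners.append((x, y))
    return corners


def bounding_box(gt, cols, rows):
    """Return the extent keyed by the FGDC bounding element names."""
    xs, ys = zip(*corner_coords(gt, cols, rows))
    return {
        "westbc": min(xs),
        "eastbc": max(xs),
        "northbc": max(ys),
        "southbc": min(ys),
    }


# Hypothetical global grid at 0.5 degrees per pixel, no rotation.
# With GDAL, these inputs would come from ds.GetGeoTransform(),
# ds.RasterXSize and ds.RasterYSize on an opened dataset.
gt = (-180.0, 0.5, 0.0, 90.0, 0.0, -0.5)
print(bounding_box(gt, 720, 360))
```

Taking the minimum and maximum over all four corners keeps the result valid when the rotation terms are non-zero.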

Template Requirements

Your FGDC template XML must:
  1. Conform to FGDC-STD-001-1998 (CSDGM - Content Standard for Digital Geospatial Metadata)
  2. Include placeholder elements for automatic population
  3. Contain complete static metadata (title, abstract, purpose, contact info, etc.)

Example Template Structure

<?xml version="1.0" encoding="UTF-8"?>
<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <title>Your Dataset Title</title>
        <pubdate>YYYYMMDD</pubdate>
      </citeinfo>
    </citation>
    <descript>
      <abstract>Dataset abstract...</abstract>
      <purpose>Purpose statement...</purpose>
    </descript>
    <spdom>
      <bounding>
        <westbc></westbc>  <!-- Auto-populated -->
        <eastbc></eastbc>  <!-- Auto-populated -->
        <northbc></northbc> <!-- Auto-populated -->
        <southbc></southbc> <!-- Auto-populated -->
      </bounding>
    </spdom>
  </idinfo>
  <spdoinfo>
    <direct>Raster</direct>
    <rastinfo>
      <rowcount></rowcount>  <!-- Auto-populated -->
      <colcount></colcount>  <!-- Auto-populated -->
    </rastinfo>
  </spdoinfo>
  <spref>
    <!-- Projection info auto-populated -->
  </spref>
</metadata>
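
Before running the script, it can help to confirm that a template actually contains the placeholder elements. The sketch below is an assumption-laden illustration: the tag list is taken only from the example template above, not from the script's source, and missing_placeholders is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Tags marked "Auto-populated" in the example template above (assumed list).
EXPECTED_TAGS = ["westbc", "eastbc", "northbc", "southbc", "rowcount", "colcount"]


def missing_placeholders(root, expected=EXPECTED_TAGS):
    """Return the expected tags that do not occur anywhere in the tree."""
    present = {elem.tag for elem in root.iter()}
    return [tag for tag in expected if tag not in present]


# Hypothetical template that lacks a <colcount> element.
template = (
    "<metadata><idinfo><spdom><bounding>"
    "<westbc></westbc><eastbc></eastbc>"
    "<northbc></northbc><southbc></southbc>"
    "</bounding></spdom></idinfo>"
    "<spdoinfo><rastinfo><rowcount></rowcount></rastinfo></spdoinfo>"
    "</metadata>"
)
template_root = ET.fromstring(template)
print(missing_placeholders(template_root))  # ['colcount']
```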

Included Templates

The repository includes example templates:
  • USGS_DEM_template.tif.xml: Designed for USGS Digital Elevation Models, with standard USGS metadata sections.

FGDC Metadata Standard

About FGDC CSDGM

The Federal Geographic Data Committee’s Content Standard for Digital Geospatial Metadata (FGDC-STD-001-1998) defines seven main metadata sections:

Identification Information

  • Citation (title, publication date, authors)
  • Description (abstract, purpose)
  • Time period of content
  • Spatial domain (bounding coordinates)
  • Keywords and themes
  • Access and use constraints

Data Quality Information

  • Positional accuracy
  • Attribute accuracy
  • Logical consistency
  • Completeness
  • Lineage (source data and processing steps)

Spatial Data Organization Information

  • Raster vs. vector
  • Resolution/cell size
  • Number of bands/dimensions

Spatial Reference Information

  • Horizontal coordinate system
  • Projection parameters
  • Geodetic model (datum, spheroid)
  • Vertical coordinate system

Entity and Attribute Information

  • Attribute definitions
  • Value domains and ranges
  • Units of measure

Distribution Information

  • Distributor contact
  • Distribution format
  • Transfer options

Metadata Reference Information

  • Metadata date
  • Metadata contact
  • Metadata standard name and version

Implementation Details

XML Processing

The script uses Python’s XML libraries in order of preference:
  1. lxml.etree (preferred - most robust)
  2. xml.etree.cElementTree (Python 2.5+; removed in Python 3.9)
  3. xml.etree.ElementTree
  4. cElementTree
  5. elementtree.ElementTree
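
A fallback chain of this kind is commonly written as nested try/except imports. The following sketch shows that pattern in the order listed above; it is an illustration and not copied from the script, and the alias etree is an assumption.

```python
# Try each XML library in order of preference and bind the first one
# that imports successfully to the name "etree".
try:
    from lxml import etree
except ImportError:
    try:
        import xml.etree.cElementTree as etree
    except ImportError:
        try:
            import xml.etree.ElementTree as etree
        except ImportError:
            try:
                import cElementTree as etree
            except ImportError:
                import elementtree.ElementTree as etree

# All of these libraries share the core ElementTree API used here.
doc = etree.fromstring("<metadata><title>x</title></metadata>")
print(doc.tag, doc.find("title").text)
```

On a current Python 3 without lxml installed, the chain settles on xml.etree.ElementTree, because xml.etree.cElementTree no longer exists there.
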
The script uses recursive search to find and populate template elements:
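def recursive_search(element, tag_to_search, replacement_value):
    """Set the text of every element named tag_to_search, at any depth."""
    # Overwrite the text if this element matches the requested tag.
    if element.tag == tag_to_search:
        element.text = replacement_value
    # Descend into the children so nested matches are updated as well.
    for child in element:
        recursive_search(child, tag_to_search, replacement_value)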
This allows flexible template structures without requiring exact element paths.
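
As a usage illustration, the sketch below applies recursive_search to a small in-memory template. The template string and the values dictionary are hypothetical; in the real script the values would come from the raster.

```python
import xml.etree.ElementTree as ET


def recursive_search(element, tag_to_search, replacement_value):
    """Set the text of every element named tag_to_search, at any depth."""
    if element.tag == tag_to_search:
        element.text = replacement_value
    for child in element:
        recursive_search(child, tag_to_search, replacement_value)


# Hypothetical template fragment with two empty placeholders.
template = (
    "<metadata><idinfo><spdom><bounding>"
    "<westbc></westbc><eastbc></eastbc>"
    "</bounding></spdom></idinfo></metadata>"
)
tree_root = ET.fromstring(template)

# Hypothetical extracted values, stored as strings for element text.
values = {"westbc": "-180.0", "eastbc": "180.0"}
for tag, value in values.items():
    recursive_search(tree_root, tag, value)

print(ET.tostring(tree_root, encoding="unicode"))
```

Because the search matches on tag name alone, every element with that tag receives the same value, so a template that reuses a tag in several places has all occurrences overwritten.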

Requirements

pip install gdal lxml

Limitations and Notes

Proof of Concept Status: Version 0.2 is a proof of concept and has not been extensively tested across use cases.
  • FGDC version support: Currently only supports FGDC-STD-001-1998 (CSDGM)
  • Manual editing required: Static content (abstract, purpose, lineage, etc.) must be in your template
  • Validation recommended: Use FGDC metadata validation tools to verify output

Use Cases

Batch Processing

Automate metadata generation for large collections of rasters with similar characteristics
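
One way to batch the script is a small wrapper that invokes it once per raster, following the basic syntax shown under Usage. This is a sketch under assumptions: the function names, the *.tif glob, and the choice to name each output after its raster with an .xml suffix are all hypothetical.

```python
from pathlib import Path
import subprocess
import sys


def build_commands(raster_dir, template, script="gdal2metadata.py"):
    """Build one gdal2metadata.py command line per .tif file in raster_dir."""
    commands = []
    for raster in sorted(Path(raster_dir).glob("*.tif")):
        # Assumed naming scheme: scene.tif -> scene.xml in the same folder.
        output = raster.with_suffix(".xml")
        commands.append(
            [sys.executable, script, str(raster), str(template), str(output)]
        )
    return commands


def run_batch(raster_dir, template, script="gdal2metadata.py"):
    """Run the commands in sequence, stopping at the first failure."""
    for cmd in build_commands(raster_dir, template, script):
        subprocess.run(cmd, check=True)


# Example call (not executed here):
# run_batch("mosaics/", "template.xml")
```

Separating command construction from execution makes it possible to print and inspect the commands before running anything.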

Data Publishing

Create compliant metadata for geospatial data portals and clearinghouses

Archive Preparation

Generate required metadata for long-term data archiving and preservation

Quality Assurance

Ensure spatial metadata consistency across multi-file datasets

While this tool generates FGDC metadata, be aware of related standards:
  • ISO 19115: International standard for geographic metadata
  • ISO 19139: XML schema implementation of ISO 19115
  • CSDGM → ISO conversion: Tools exist to convert FGDC to ISO format
For NASA planetary data, consider PDS4 metadata standards in addition to or instead of FGDC.

Credits

Author: Trent Hare, USGS (thare@usgs.gov)
Version: 0.2 (October 2011)
Based on: gdalinfo.py by Even Rouault and Frank Warmerdam
Released under MIT License. Use and modify freely for your metadata workflows.